An electrophysiological investigation of the temporal asynchrony effect on
character-speech sound integration in Chinese typically developing children and
children with dyslexia


Research highlights 

1. The present study found different neural correlates of character-speech sound 
integration in the temporal synchrony and temporal asynchrony conditions. 

2. Congruency effects were observed on the P1, N170 and N300 components when
visual characters were presented synchronously with auditory speech sounds.

3. Congruency effects were observed on the N200 component when auditory speech
sounds were presented 300 or 600 ms after the visual characters.

4. Dyslexic children showed insufficient character-speech sound integration in the
AV300 condition. The character-speech sound integration deficit in dyslexic readers
was modulated by stimulus onset asynchrony.


Abstract 

The neural mechanism of orthographic-phonological integration is influenced
by the temporal relationship of cross-modal stimuli. However, previous studies
mainly investigated the neural mechanism of letter-speech sound integration under
precise temporal synchrony or small temporal asynchrony. In this study,
character-speech sound integration was investigated in a relatively wide temporal
window. Chinese characters were presented synchronously with the onset of speech
sounds or 300 or 600 ms before the speech sounds (referred to as AV0, AV300 and
AV600). ERP responses evoked in the congruent condition (speech sounds paired
with congruent visual characters) and the baseline condition (speech sounds paired
with Korean characters) were compared. Different electrophysiological markers were
found in the temporal synchrony and temporal asynchrony conditions. In the AV0
condition, children with developmental dyslexia (DD) and typically developing (TD)
children showed similar congruency effects on the P1, N170 and N300 components,
demonstrating the influence of speech sounds on visual character processing. In the
AV300 condition, the DD group showed a left-lateralized congruency effect on the
N200, whereas TD children showed a bilateral congruency effect on the N200. Both
groups showed a bilateral congruency effect on the N200 in the AV600 condition. We
speculate that the insufficient character-speech sound integration exhibited by
dyslexic children in the AV300 condition was probably caused by their slow visual
processing speed. The results provide unique insight into the neural mechanism of
print-speech integration in a wide temporal window and highlight the need to
investigate this mechanism beyond precise temporal synchrony.
asynchrony, event-related potentials (ERP), neural correlates, developmental dyslexia


1. Introduction 

Reading is essential to children’s academic achievement and is associated with
future social and economic success in literate societies. Although most people
acquire literacy skills easily, around 5-10% of children are affected by developmental 
dyslexia and suffer from different degrees of reading difficulties (Lyon, Shaywitz, & 
Shaywitz, 2003). Developmental dyslexia (DD) is a learning disorder characterized by 
difficulties in accurately and/or fluently reading in spite of an average IQ, adequate 
schooling, and normal sensory acuity (Lyon, Shaywitz, & Shaywitz, 2003). The 
acquisition of grapheme-phoneme associations is one of the basic requirements for 
fluent reading and its failure may contribute to reading difficulties in developmental 
dyslexia (Žarić et al., 2014; Wimmer & Schurz, 2010; Blomert, 2011; Wallace &
Stevenson, 2014; Hahn, Foxe, & Molholm, 2014). Many dyslexia interventions include a
component focused on letter-speech sound mapping training (Aravena, Snellings,
Tijms, & van der Molen, 2013; Tijms & Hoeks, 2010). However, the nature and
mechanisms involved in orthographic-phonological integration are still unclear 
(Mercier & Cappe, 2020). 

One critical factor affecting the success of audiovisual integration is temporal 
proximity (Calvert, Spence, & Stein, 2004). In an fMRI study, van Atteveldt, 
Formisano, Blomert and Goebel (2007) investigated the effect of temporal 
asynchrony on letter-speech sound integration. In their study, a heteromodal area in
the superior temporal sulcus (STS) and a modality-specific auditory association cortex
were identified as being involved in letter-speech sound integration. The STS receives
both visual and auditory information. Visual letters influenced the processing
of speech sounds through feedback from the STS to the auditory association cortex
(van Atteveldt, Formisano, Blomert & Goebel, 2007). Most interestingly, integration
(stronger BOLD response to congruent letter-sound pairs than to incongruent letter- 
sound pairs) in the left STS occurred within a relatively wide temporal window (about 
300 ms), whereas integration in auditory association areas depended on temporal 
synchrony. The findings provided evidence for the influence of temporal asynchrony 
on letter-speech sound integration. 

Although fMRI studies provide insight into the precise localization of the neural
mechanisms involved in letter-speech sound integration, they cannot provide precise
temporal information about the processing. The precise neural time course of
letter-speech sound integration can only be investigated with a method of high
temporal resolution such as event-related potentials (ERP). To date, only a few
studies have investigated the effect of temporal asynchrony on letter-speech sound
integration using the ERP technique. Most of these studies used the mismatch
negativity (MMN) as an index of
automatic letter-speech integration. The MMN is evoked between 100 and 200 ms 
after stimulus onset and is considered to reflect the neurophysiological correlate of a 
comparison process between an incoming deviant auditory stimulus and the memory 
trace of repeatedly presented standard auditory stimulus (Schröger, 1998). 

Froyen, Van Atteveldt, Bonte and Blomert (2008) recorded EEG data while
healthy Dutch adult readers completed a passive oddball task in which speech sounds
(standard, 90%; deviant, 10%) or letter-speech sound pairs (the visually presented
letter was always an “a”, irrespective of the speech sound) were presented. The
stimulus onset asynchronies (SOAs) between the presentation of the letter and the
speech sound were manipulated, including a synchronous condition (AV0) and two
conditions in which the visual stimulus led the auditory stimulus (by 100 or 200 ms,
referred to as AV100 and AV200 from here on). Compared with the MMN in the isolated
auditory condition, a larger MMN amplitude (reflecting integration) in the
audiovisual condition was found only in the AV0 condition. Likewise, Mittag and
colleagues (2011, 2013) found enhanced MMN amplitude in healthy adult readers in the
AV0 condition but not in the AV200 condition. The results in children were slightly
different. When investigating SOA effects on letter-speech sound integration in
typically developing (TD) children, Froyen, Bonte, Van Atteveldt and Blomert (2009)
did not find enhanced MMN amplitude in the AV0 condition, but found enhanced MMN
amplitude in the AV200 condition. The authors interpreted the MMN as reflecting
mature and automatic cross-modal integration. The absence of an enhanced MMN in the
AV0 condition indicated that automatic print-speech integration had not been fully
developed in the children. Employing an adapted version of the oddball paradigm of
Froyen and colleagues (2008, 2009, 2011), Žarić and colleagues (2014) examined SOA
effects on letter-speech sound integration in three groups of children (typical
readers, dysfluent dyslexics, and severe dysfluent dyslexics). In the AV0 condition,
enhanced MMN amplitudes were observed both in typical readers and in dysfluent
dyslexics. In the AV200 condition, enhanced MMN amplitudes were found in all groups.
The authors attributed the different findings between their study and the studies by
Froyen and colleagues (2009, 2011) to methodological differences such as the
proportion of deviants (increased from 10% or 15% to 17%) and trial length
(increased from 1250 ms to 1700 ms) (Žarić et al., 2014).

Taken together, the aforementioned studies reveal interactions between temporal
proximity and experimental parameters in the MMN integration effect, indicating that
temporal synchrony or small temporal asynchrony is critical for the integration effect
reflected by the MMN. According to brain imaging studies, the major MMN source is
located in the early auditory cortex (Alho, 1995; Giard, Perrin, Pernier & Bouchet,
2010), including the auditory association cortex, which is reported to be dependent on
temporal synchrony (van Atteveldt, Formisano, Blomert & Goebel, 2007). The
dependence on temporal synchrony of the MMN integration effects observed in the ERP
studies resembles that of the auditory association cortex observed in the fMRI study
by van Atteveldt, Formisano, Blomert and Goebel (2007). It is therefore reasonable
to speculate that the MMN is not suitable for investigating the neural mechanism of
multisensory integration in a relatively wide temporal window. However, letter-speech
sound integration can occur at relatively long temporal asynchronies (Nash et al.,
2016; Clayton & Hulme, 2017; Hasko, Bruder, Bartling & Schulte-Körne, 2012). It is
necessary to describe a whole temporal profile for the neural mechanisms of letter- 
speech integration using other electrophysiological markers. Moreover, letter-speech 
integration includes multiple processing stages. In the fMRI study by van Atteveldt, 
Formisano, Blomert and Goebel (2007) and the ERP studies by Froyen and colleagues 
(2008, 2009, 2011), the neural correlates of integration reflected only the influence of
visual letters on speech processing. The influence of speech sounds on visual stimuli
was not examined. In the study by Froyen, van Atteveldt and Blomert (2010), visual
MMN was used to investigate the influence of speech sound on letter processing but 
no visual MMN amplitude modulation was found. Exploring letter-speech sound 
integration with other electrophysiological signatures provides an opportunity to 
reveal both sides of letter-speech sound integration. 

Under these circumstances, the first aim of the current study was to explore the
neural mechanisms of print-speech integration across a wide range of temporal
asynchrony using electrophysiological correlates that are less dependent on temporal
synchrony. Compared with typical readers, dyslexic readers have been reported to
show a broader temporal integration window for both speech and non-speech stimuli
(Laasonen, Service, & Virsu, 2002; Virsu, Lahti-Nuuttila, & Laasonen, 2003; Hairston
et al., 2005; Wallace & Stevenson, 2014; Meilleur, Foster, Coll, Brambati, & Hyde,
2020). The second aim of this study was to examine whether dyslexic children exhibit
different neural mechanisms of letter-speech sound integration from typically
developing children. To this end, Chinese dyslexic children and nondyslexic children
were recruited as participants. In the present study, three levels of SOA were included:
a synchronous condition (AV0), visual stimuli leading auditory stimuli by 300 ms
(AV300), and visual stimuli leading auditory stimuli by 600 ms (AV600). We chose the
0 and 300 ms SOAs because they proved effective in the study by van
Atteveldt, Formisano, Blomert and Goebel (2007). We chose the 600 ms SOA because
it is close to the 500 ms SOA used in the study by Clayton and Hulme (2017) and is
shorter than the 1000 ms SOA used in the study by Nash et al. (2016). High temporal
resolution event-related potentials (ERPs) were used to measure real-time
print-speech integration processing in all SOA conditions. ERP waveforms evoked by
the onset of the auditory stimuli were analyzed.


2. Method 
2.1. Participants 

Thirty children between the ages of 9 and 11 were recruited from primary
schools in Beijing. The children were in grades 3 to 6. There were 14 children with
developmental dyslexia (DD: 4 females and 10 males, mean age 10.10 years) and 16
typically developing children as controls (TD: 5 females and 11 males, mean age
10.29 years). All participants were right-handed with Chinese as their native language,
had normal or corrected-to-normal vision, and did not report a history of neurological
or psychiatric disorders. Two participants in the control group and one in the dyslexic
group were excluded from the subsequent ERP analysis because of excessive artifacts.
This study was approved by the Ethics Committee of the Institute of Psychology,
Chinese Academy of Sciences. Prior to the experiment, written informed consent was
obtained from the guardians of the children.

The inclusion criteria for selecting children with dyslexia were adopted from
studies conducted with Mandarin-speaking Chinese participants (Meng, Cheng-Lai,
Zeng, Stein, & Zhou, 2011; Wang, Bi, Gao, & Wydell, 2010). Intelligence quotient
(IQ) and vocabulary size were assessed with the Combined Raven’s Progressive Matrices
Test (CRT) (Li, Chen, & Jin, 1989) and the Chinese Character Recognition Test (CCRT)
(Wang & Tao, 1996), respectively. In the Combined Raven’s Test, nonverbal intelligence
was tested with a series of matrices of increasing difficulty. In the Chinese Character
Recognition Test, 210 characters are divided into 10 sub-groups according to level of
difficulty. The participants were required to write down a compound word using a
given Chinese character. For example, if they saw the character meaning “grass”, they
were to write down a compound word such as “grassland” or “grass-plot”. The
raw score for each sub-group was multiplied by a coefficient corresponding to its
difficulty level, and the scores for the ten sub-groups were then summed for each
participant. Individuals with normal IQ (IQ > 85) and a low CCRT score (at least one
and a half standard deviations below the average score of typically developing
children in the same grade) were recruited as dyslexic participants in this study.
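
The scoring and screening rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' scoring procedure: the difficulty coefficients in `COEFFICIENTS` are hypothetical placeholders (the text does not report them), and the function names and the use of the sample standard deviation for the grade norms are likewise assumptions.

```python
from statistics import mean, stdev

# Hypothetical difficulty coefficients for the 10 CCRT sub-groups.
# The actual coefficients are not reported in the text.
COEFFICIENTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def ccrt_score(raw_scores, coefficients=COEFFICIENTS):
    """Weighted sum of the raw scores of the ten sub-groups."""
    if len(raw_scores) != len(coefficients):
        raise ValueError("one raw score per sub-group is required")
    return sum(r * c for r, c in zip(raw_scores, coefficients))


def meets_dyslexia_criteria(iq, ccrt, grade_norm_scores, sd_cutoff=1.5):
    """True if IQ > 85 and the CCRT score lies at least `sd_cutoff` standard
    deviations below the mean of same-grade typically developing children."""
    threshold = mean(grade_norm_scores) - sd_cutoff * stdev(grade_norm_scores)
    return iq > 85 and ccrt <= threshold
```

For example, with grade norms of `[100, 110, 90, 105, 95]`, a child with an IQ of 100 and a CCRT score of 50 would satisfy both criteria, whereas a CCRT score of 95 would not.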
2.2. Reading-related Measures 

Prior to the ERP experiment, both groups of children completed a series of
linguistic-cognitive tests, including reading fluency, reading accuracy, morphological
awareness, phonological awareness, rapid naming, short-term memory, and visual
spatial attention. Detailed information on the participants is presented in Table 1.
Reading fluency test

This test was developed by Qian, Deng, Zhao, and Bi (2015) and includes a
fixed list of 160 high-frequency (233.71 occurrences per million) Chinese characters
selected from the Modern Chinese Frequency Dictionary (1986). The
children were required to read each character aloud as fluently and accurately as they
could within 1 min. The number of correctly read characters was recorded as their
reading fluency score.
Reading accuracy test 

A list of 172 characters of increasing difficulty (i.e., increasing stroke number
and decreasing frequency) was presented on paper. Participants were asked to
read each character aloud in sequence. When the participant failed to correctly read 
five characters consecutively, the test was terminated. Participants were scored on the 
total number of correctly read characters. 
Morphological awareness test 

A morpheme judgment task similar to that in Shu, McBride-Chang, Wu, and Liu
(2006) was used in this study. The test contains 20 pairs of two-character Chinese
words. Each pair of words shared one character (e.g., a word meaning “level”
vs. a word meaning “wait”). Participants were asked to judge whether the shared
character had the same meaning in both words, i.e., whether it represented the same
morpheme. One point was scored for each correct judgment.
Phonological awareness task 

Phonological awareness was measured by an oddball paradigm adopted from
Qian and Bi (2015). There were three subtests corresponding to three types of oddity:
onset, rime, and lexical tone. There were ten trials for each
type of oddity. In each trial, three Chinese characters were presented orally by the
experimenter. Participants were required to identify the phonologically odd item
among them. One point was scored for each correct judgment.
Rapid automatized naming task (RAN) 

This task was adapted from Denckla and Rudel (1976). RAN was measured by
two subtests: picture naming and digit naming. Five pictures (e.g., flower,
book, dog, hand, shoe) or five digits (e.g., 2, 4, 6, 7, 9) were presented repeatedly in
random order in a 6-row × 5-column grid. The children were instructed to name
each picture or digit in sequence as quickly as possible. All items were named twice.
Naming latencies were recorded with a stopwatch, and the final score was the average
time of the two readings. A higher score indicates slower naming.
Short-term memory task 

A subtest of the WAIS-III (Wechsler Adult Intelligence Scale, third edition; Wechsler,
1997) was used. The test has two parts: forward memory and backward memory. In
the forward memory subtest, the children were presented with a list of numbers and
then asked to repeat them in order. In the backward memory subtest, the children were
presented with a list of numbers and required to repeat the numbers in reverse order.
The number of test items increased gradually. When a child made two
consecutive mistakes, the test was stopped. The largest correct number was taken as the
final score.

Visual spatial attention 

A digit cancellation task was used to test visual spatial attention (see Takeshi,
Kazuhito, Yasuhiro, Mitsuhito, & Hidehiro, 2013). Numbers from 0 to 9 were
repeatedly presented in a 25-row × 40-column matrix. Participants were instructed
to search through the matrix and delete the number “3” sequentially within three
minutes. Visual spatial attention ability was measured as ‘correct number − (false
number + 0.5 × missed number)’. A higher score indicates better visual spatial
attention ability.
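
The scoring formula above can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and `score_from_marks` assumes that every unmarked target in the supplied set counts as missed, which the text does not specify (for instance, targets not reached within the time limit might be treated differently).

```python
def visual_attention_score(correct, false_alarms, missed):
    """Score = correct - (false + 0.5 * missed), as described in the text."""
    return correct - (false_alarms + 0.5 * missed)


def score_from_marks(targets, marked):
    """Derive the three counts from matrix positions (row, column).

    targets: positions that contain the digit to be cancelled.
    marked:  positions the participant crossed out.
    Assumption: every unmarked target counts as missed.
    """
    targets, marked = set(targets), set(marked)
    correct = len(targets & marked)
    false_alarms = len(marked - targets)
    missed = len(targets - marked)
    return visual_attention_score(correct, false_alarms, missed)
```

With 90 correct cancellations, 2 false cancellations and 10 misses, the score is 90 − (2 + 5) = 83.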
2.3. Stimuli and Procedure 

A passive priming task adapted from the study by Nash and colleagues (2016) 
was employed in this experiment. There were six experimental conditions: bimodal 
congruent at 3 SOAs (AV0, AV300, and AV600) and bimodal baseline at 3 SOAs. In 
the congruent condition, the prime was a visually presented Chinese single character 
and the target was the corresponding speech sound. In the baseline condition, the 
prime was a visually presented Korean character and the target was a Chinese speech 
sound. Speech sounds were recorded by a female native Chinese speaker and digitally 
converted to 16-bit resolution at a sampling rate of 44.1 kHz. The mean duration of
the speech sounds was 426.04 ms (SD = 111.13). The visual stimuli consisted of 80
Chinese single characters and 80 stroke-matched Korean characters. Korean characters
were used as non-word symbols because they were unfamiliar to the children in this study.

The priming task was run using E-prime version 2.0 (Schneider et al., 2002). A
schematic description of stimulus presentation is shown in Figure 1. At the beginning of
each trial, a central fixation was presented for 500 ms. Next, a black visual character
appeared on a white screen for 1000 ms. The characters were written in Song font, size
72. They were presented on a 17-inch screen, and the children sat 70 cm from the
screen. The auditory stimuli were presented over headphones simultaneously with the
visual characters or after a delay of 300 ms or 600 ms. A blank screen with a
random duration of 500 to 1000 ms was presented after the offset of the visual
characters. Participants were instructed to look at the characters and listen to the
sounds. Catch trials were included to ensure that the children were attending to the
stimuli during the task. In a catch trial, children were required to press the spacebar
when they saw a yellow cartoon picture. There were 5 blocks in the formal experiment;
each block contained 96 experimental trials and 16 catch trials, for 560 trials in
total. The formal experiment lasted about 30 min (excluding breaks). Before the
formal experiment started, a practice session of 20 trials was
presented to familiarize the children with the experimental procedure.
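
The trial structure described above can be summarized as a small timing sketch. It only restates the reported durations; the constant and function names are hypothetical, and the assumption that the trial ends when the blank screen ends is ours rather than the authors'.

```python
BLOCKS = 5
EXPERIMENTAL_PER_BLOCK = 96
CATCH_PER_BLOCK = 16

FIXATION_MS = 500
VISUAL_MS = 1000
SOA_MS = {"AV0": 0, "AV300": 300, "AV600": 600}


def total_trials():
    """Total number of trials across all blocks."""
    return BLOCKS * (EXPERIMENTAL_PER_BLOCK + CATCH_PER_BLOCK)


def trial_timeline(condition, blank_ms):
    """Event onsets in ms from trial start for one experimental trial."""
    if not 500 <= blank_ms <= 1000:
        raise ValueError("blank interval must be between 500 and 1000 ms")
    visual_on = FIXATION_MS
    return {
        "fixation_on": 0,
        "visual_on": visual_on,
        # The sound starts SOA ms after the character appears.
        "audio_on": visual_on + SOA_MS[condition],
        "visual_off": visual_on + VISUAL_MS,
        # Assumption: the trial ends with the blank screen.
        "trial_end": visual_on + VISUAL_MS + blank_ms,
    }
```

Because the character stays on screen for 1000 ms, the speech sound always begins while the character is still visible, even in the AV600 condition.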

2.4. EEG recording and ERP analysis 

Using a 64-channel electrode cap (NeuroScan II Inc., El Paso, TX, USA), EEG
data were collected in a quiet and dimly lit room. Electrodes were positioned
according to the Extended International 10-20 System. The channels were referenced
to the left mastoid. In addition, four bipolar electrodes were used to record vertical and
horizontal eye movements, including blinks. EEG data were sampled at 1000 Hz and
bandpass filtered from 0.05 Hz to 100 Hz online. Electrode impedance levels were kept
below 5 kΩ.

The raw data were processed offline using EEGLAB (Delorme and Makeig,
2003; http://www.sccn.ucsd.edu/eeglab). Data were re-referenced to the average of the
64 electrodes and bandpass filtered from 0.1 Hz to 40 Hz. An independent component
analysis (ICA) was run on the pre-processed data to reduce artifacts caused
by eye movement or body motion. The EEG data were then segmented into epochs
that began 100 ms before the onset of the auditory stimuli and ended 700 ms
after it. The 100 ms interval before stimulus onset was
used as the baseline. Epochs were rejected if amplitudes exceeded ±100 μV. The mean
numbers of accepted epochs for each condition and group are given in Table 2.
Epochs were averaged for each condition and each group. Catch trials were not
analyzed. The electrode layout and the electrodes selected for
data analysis are shown in Figure 2. Mixed-design ANOVAs with Greenhouse-Geisser
correction were applied to analyze the ERP data. The alpha level for all
analyses was 0.05.
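
The segmentation, baseline correction and amplitude-based rejection steps can be illustrated with a short NumPy sketch. The study itself used EEGLAB; this code is not the authors' pipeline, the function name is hypothetical, and applying the ±100 μV criterion after baseline correction is an assumption based on the order in which the steps are described.

```python
import numpy as np

SFREQ = 1000                      # Hz, as reported
TMIN_MS, TMAX_MS = -100, 700      # epoch limits relative to auditory onset
REJECT_UV = 100.0                 # rejection threshold in microvolts


def epoch_and_clean(data_uv, onsets):
    """Cut, baseline-correct and screen epochs.

    data_uv: array (n_channels, n_samples) in microvolts.
    onsets:  sample indices of auditory stimulus onsets.
    Returns an array (n_kept, n_channels, n_times).
    """
    pre = int(-TMIN_MS * SFREQ / 1000)    # samples before onset
    post = int(TMAX_MS * SFREQ / 1000)    # samples after onset
    kept = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > data_uv.shape[1]:
            continue  # epoch would run past the recording
        epoch = data_uv[:, onset - pre:onset + post].astype(float)
        # Subtract the mean of the 100 ms pre-stimulus interval per channel.
        epoch = epoch - epoch[:, :pre].mean(axis=1, keepdims=True)
        # Keep the epoch only if no sample exceeds +/-100 microvolts.
        if np.abs(epoch).max() <= REJECT_UV:
            kept.append(epoch)
    if not kept:
        return np.empty((0, data_uv.shape[0], pre + post))
    return np.stack(kept)
```

Averaging the returned array over its first axis gives the ERP for one condition and participant.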


3. Results 


3.1 Synchronous audiovisual condition (AV0) 


Grand-average ERPs in response to the simultaneous onset of visual and auditory
stimuli for all conditions and groups are plotted in Figure 3 for representative
electrodes. The occipito-temporal P1 and N170 and the fronto-temporal N300 were
observed in the AV0 condition.


P1


Peak latency was measured in the time window of 100-180 ms, and mean
amplitude was measured in a 40 ms time window centered on the peak latency.
Mixed-design ANOVAs with Greenhouse-Geisser correction were performed on the
latency and amplitude of the P1, with group as a between-subject factor, and trial type
(congruent, baseline) and laterality (left hemisphere: P7, PO7, PO5; right
hemisphere: P8, PO8, PO6) as within-subject factors. For the P1 latency, the
interaction between trial type and laterality was significant (F(1,25) = 7.36, p = .012,
η² = .227). P1 latency was marginally longer in the congruent condition than in the
baseline condition on the left hemisphere (141 vs. 134 ms, t(26) = 2.01, p = .055). No
difference was found in P1 latency between the two conditions on the right
hemisphere (138 vs. 139 ms, t(26) = -0.496, p = .624). No significant effect was found
for the P1 amplitude.
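
The peak measurement described above (peak latency within a search window, then mean amplitude in a 40 ms window centered on that peak) can be sketched as follows. This is an illustrative implementation, not the authors' code; the function name, the `polarity` argument and the inclusive window edges are assumptions.

```python
import numpy as np


def peak_measures(erp_uv, times_ms, window=(100, 180), polarity=+1,
                  half_width_ms=20):
    """Return (peak latency in ms, mean amplitude around the peak).

    erp_uv:   1-D averaged waveform for one electrode or electrode cluster.
    times_ms: time of each sample relative to stimulus onset.
    polarity: +1 for a positive peak (e.g. P1), -1 for a negative peak.
    """
    erp_uv = np.asarray(erp_uv, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    idx = np.flatnonzero((times_ms >= window[0]) & (times_ms <= window[1]))
    # Largest deflection of the requested polarity inside the search window.
    peak_idx = idx[np.argmax(polarity * erp_uv[idx])]
    peak_ms = times_ms[peak_idx]
    around = (times_ms >= peak_ms - half_width_ms) & \
             (times_ms <= peak_ms + half_width_ms)
    return float(peak_ms), float(erp_uv[around].mean())
```

For a negative component such as the N170, the same function would be called with `window=(180, 250)` and `polarity=-1`.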


N170 
The N170 peak latency was measured in the time window of 180-250 ms, and mean
N170 amplitude was measured as the average over a 40 ms time window centered on
the peak latency. Mixed-design ANOVAs with Greenhouse-Geisser correction were
performed on the latency and amplitude of the N170, with group as a between-subject
factor, and trial type (congruent, baseline) and laterality (left hemisphere: P7, PO7,
PO5; right hemisphere: P8, PO8, PO6) as within-subject factors. A significant main
effect of trial type was found for N170 latency (F(1,25) = 6.22, p = .02, η² = .199). In
comparison to the baseline condition, the congruent condition induced longer N170
latency (212 vs. 206 ms, t(26) = 2.51, p = .019). For the N170 amplitude, the main
effect of trial type was significant (F(1,25) = 8.06, p = .009, η² = .244). Compared with
baseline, congruent trials evoked larger N170 amplitude (-2.54 vs. -1.28 μV). The
main effect of laterality was marginally significant (F(1,25) = 4.14, p = .053, η² = .142),
with greater N170 amplitude on the left hemisphere (-2.88 vs. -0.93 μV). The main
effect of group was significant (F(1,25) = 6.52, p = .017, η² = .207), with larger N170
amplitude in the TD group (-3.67 vs. -0.15 μV). No significant interaction was found
for the N170.
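
As a reading aid, the effect sizes reported in this section are numerically consistent with partial eta squared computed from the F value and its degrees of freedom, ηp² = (F × df_effect) / (F × df_effect + df_error). The helper below is a sketch with a hypothetical function name; it describes the reported numbers and is not a statement about the software the authors used.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported for the N170 in the AV0 condition, all with df = (1, 25).
for f_value in (6.22, 8.06, 4.14, 6.52):
    print(f_value, round(partial_eta_squared(f_value, 1, 25), 3))
```

For example, F(1,25) = 8.06 gives 8.06 / (8.06 + 25) ≈ .244, matching the value reported for the trial type effect on N170 amplitude.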

N300 

The N300 was broadly distributed in the fronto-temporal area in the AV0 condition.
The mean amplitude of the N300 was measured in a 260-340 ms time window.
Mixed-design ANOVAs with Greenhouse-Geisser correction were performed on the mean
N300 amplitude, with group as a between-subject factor, and trial type (congruent,
baseline) and laterality (left hemisphere: F7, F5, F3; midline: FP1/2, FPZ, AF3/4, F1/2,
FZ; right hemisphere: F8, F6, F4) as within-subject factors. The main effect of
trial type was significant (F(1,25) = 27.28, p < .001, η² = .522), with larger N300
amplitude in the baseline condition (-4.55 vs. -6.83 μV). The main effect of laterality
was significant (F(2, 50) = 3.63, p = .034, η² = .127). In comparison to the right
hemisphere, the N300 amplitude was larger on the midline (-6.04 vs. -4.9 μV, t(26) =
2.3, p = .03). No other main effect or interaction was significant.
3.2 AV300 condition 

Grand-average ERPs in response to the onset of auditory stimuli for all
conditions and groups are plotted in Figure 4 for representative electrodes. A
fronto-temporal negativity peaking around 200 ms was found in the AV300 condition.
The mean N200 amplitude was measured in a 170-230 ms time window. Mixed-design
ANOVAs with Greenhouse-Geisser correction were performed on the mean N200
amplitude, with group as a between-subject factor, and trial type (congruent, baseline)
and laterality (left hemisphere: F7, F5, FT7, FC5; midline: AF3, FP1, FP2, FPZ, AF4;
right hemisphere: F6, F8, FC6, FT8) as within-subject factors. The main effect of trial
type was significant (F(1,24) = 19.78, p < .001, η² = .452), with greater amplitude in
the congruent condition (-6.83 vs. -4.55 μV). The main effect of laterality was
significant (F(2, 48) = 18.19, p < .001, η² = .431), with the largest amplitude on the
midline (ps < .001). There was no difference between the amplitudes on the left
and right hemispheres. The interaction of trial type, laterality and group
was significant (F(2, 48) = 3.29, p = .046, η² = .121). Follow-up ANOVAs showed
a significant group effect (F(1,24) = 4.29, p = .049, η² = .151) and an interaction
between trial type and group (F(1,24) = 4.77, p = .039, η² = .166) on the right
hemisphere. In comparison to the DD group, the TD group showed larger N200
amplitude on the right hemisphere (-2.65 vs. -0.52 μV). Furthermore, for the TD group,
congruent trials evoked larger N200 amplitude than baseline trials on the right
hemisphere (-3.50 vs. -1.80 μV, t(12) = -3.69, p = .003), whereas no difference was
found between the two conditions for the DD group.
3.3 AV600 condition

Grand-average ERPs in response to the onset of auditory stimuli for all
conditions are plotted in Figure 5 for representative electrodes. As in the
AV300 condition, a fronto-temporal N200 was found in the AV600 condition. The mean
amplitude of the N200 was measured in a 150-250 ms time window. Mixed-design
ANOVAs with Greenhouse-Geisser correction were performed on the mean
amplitude, with group as a between-subject factor, and trial type (congruent, baseline)
and laterality (left hemisphere: F7, F5, FT7, FC5; right hemisphere: F6, F8, FC6,
FT8) as within-subject factors. The main effect of trial type was significant (F(1,24) =
9.02, p = .006, η² = .273). In comparison to baseline, the congruent condition evoked
larger N200 amplitude (-1.25 vs. -0.12 μV). The main effect of group was significant
(F(1,24) = 5.52, p = .027, η² = .25), with larger amplitude in the TD group (-1.52 vs.
0.14 μV). No other significant effect was found.
3.4 Relationships between the measures of integration effect and reading-related 
skills 

To further investigate whether the congruency effects (differences between ERP
amplitudes in the congruent and baseline conditions) correlated with individual
reading-related skills, we calculated partial correlation coefficients (controlling for
age, IQ, and vocabulary size) between the congruency effects and reading skill scores
in the total sample (both groups) and in each group separately, adopting a two-tailed
significance level of α = 0.05 in both cases. Since the present study focuses on the
SOA effect on audiovisual integration in different groups, we present the analyses
separately for the TD and DD groups.
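
A partial correlation of this kind can be computed by regressing the covariates out of both variables and correlating the residuals. The sketch below shows that standard approach with NumPy; it is not the authors' analysis script, and the function and variable names are hypothetical.

```python
import numpy as np


def partial_corr(x, y, covariates):
    """Pearson correlation between x and y after removing the linear
    contribution of the covariates (plus an intercept) from both.

    x, y:       1-D arrays with one value per participant, e.g. the
                congruency effect and a reading score.
    covariates: array (n_participants, k), e.g. age, IQ, vocabulary size.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of x and y after ordinary least-squares fits on z.
    res_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    res_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])
```

With k covariates, the significance test of the resulting coefficient uses n − 2 − k degrees of freedom, which matters for groups as small as those analyzed here.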
In the AV0 condition, different correlations were found in the two groups. In the
TD group, the congruency effect on P1 amplitude was strongly correlated with reading
fluency (r = .67, p = .012) and RAN (r = -.66, p = .011) on the right hemisphere. The
congruency effect on P1 latency was significantly correlated with reading accuracy (r
= .613, p = .044) on the right hemisphere. For the congruency effects on P1 amplitude
and latency, positive values reflected greater congruency effects. The correlations
between the congruency effect on P1 amplitude and reading fluency, and between the
congruency effect on P1 latency and reading accuracy, were positive, indicating that a
larger congruency effect was associated with higher reading fluency and reading
accuracy. The correlation between the congruency effect on P1 amplitude and
RAN was negative, indicating that a larger congruency effect was associated with
faster naming (smaller RAN scores). The correlation between the
congruency effect on N300 amplitude and visual spatial attention was significant on
the right hemisphere (r = .613, p = .044). For the congruency effect on N300
amplitude, positive values reflected greater congruency effects. The positive
correlation means that a larger congruency effect was associated with
higher visual spatial attention capacity. In the DD group, a significant correlation was
found between the congruency effect on P1 latency and morphological awareness (r =
.662, p = .037) on the right hemisphere, suggesting that a greater congruency effect was
associated with better morphological awareness. A significant correlation was also
found between the congruency effect on N300 amplitude and RAN (r = .707, p = .022) on
the right hemisphere. Since larger RAN scores mean slower naming, the
positive correlation between the congruency effect on N300 amplitude and
RAN indicates that a greater congruency effect was associated with slower naming.


In the AV300 condition, the correlation between the congruency effect on N200
amplitude and RAN was strong and significant (r = .686, p = .028) on the left
hemisphere. For the congruency effects on N200 amplitude, negative values reflected
greater congruency effects. The positive correlation coefficient between the congruency
effect on N200 amplitude and RAN therefore means that a greater (more negative)
congruency effect was associated with faster naming (lower RAN scores). In the DD
group, the correlation between the congruency effect on N200 amplitude and visual
spatial attention was significant (r = -.65, p = .042) on the left hemisphere. The
negative correlation coefficient means that a greater congruency effect was associated
with higher visual spatial attention capacity. No significant correlation was found
between the congruency effect on N200 amplitude and reading-related skills in either
group in the AV600 condition.
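
The sign conventions used in these correlations can be made concrete with a small sketch. It assumes that the congruency effect is a per-child difference score computed as congruent minus baseline (the text states only which sign reflects a greater effect, so this definition is an assumption). All amplitudes and RAN scores below are invented purely for illustration, and `pearson_r` and `congruency_effect` are hypothetical helper names rather than the analysis code used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def congruency_effect(congruent, baseline):
    """Per-child difference score (ASSUMED definition: congruent minus baseline)."""
    return [c - b for c, b in zip(congruent, baseline)]


# Invented N200 mean amplitudes (microvolts) for five hypothetical children.
n200_congruent = [-6.0, -5.0, -4.5, -3.5, -3.0]
n200_baseline = [-3.0, -3.0, -3.0, -3.0, -3.0]

# Invented RAN scores; larger values mean slower naming.
ran_scores = [12.0, 14.0, 15.0, 17.0, 19.0]

# For a negative-going component such as N200, a more negative difference
# score means a greater congruency effect.
effect = congruency_effect(n200_congruent, n200_baseline)

r = pearson_r(effect, ran_scores)
r_flipped = pearson_r([-e for e in effect], ran_scores)

print(f"r (effect vs. RAN)          = {r:.3f}")
print(f"r (sign-flipped effect vs. RAN) = {r_flipped:.3f}")
```

With these invented values, r is positive because the children with the more negative (greater) N200 effect also have the lower RAN scores, that is, faster naming. Negating the difference score flips the sign of r without changing its magnitude, which is why the direction of each reported correlation has to be read together with the stated sign convention for that component.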


4. Discussion 

The current study investigated the electrophysiological correlates of cross-modal
character-speech sound processing in children with developmental dyslexia and typically
developing children under different SOA conditions. Different neural correlates were
found in the temporal synchrony and temporal asynchrony conditions. Congruency
effects were found for both groups across all three SOAs. A group difference in
congruency effects was found only on the right hemisphere in the AV300 condition,
where TD children exhibited a significant congruency effect and children with
dyslexia did not. The main findings are discussed in turn below.


4.1 Electrophysiological indices of congruency effects in the AV0 condition

In the AV0 condition, the concurrently presented stimuli induced P1 and N170
components in the occipito-temporal area and N300 in the fronto-temporal area.
Congruent trials induced longer P1 latency, longer N170 latency and larger N170
amplitude in comparison to baseline. The P1 component has been linked with the
physical characteristics of stimuli and selective attention (Hillyard, Teder-Sälejärvi &
Münte, 1998). The N170 has been associated with the activity of the visual word form
area and is sensitive to orthographic information (Cohen et al., 2000; Sehyr, Midgley,
Holcomb, Emmorey, & Behrmann, 2020). Increased latencies of P1 and N170
have been linked to a delay in visual processing speed (Bieniek, Frei, & Rousselet,
2013), suggesting that the large sensory processing load in the early stage of
audiovisual integration slows down visual processing. We speculate
that the greater N170 amplitude indicates increased cognitive processing load and
more orthographic processing effort. As P1 and N170 reflect early processing of
visual stimuli, the observed congruency effects on P1 and N170 reflect the influence
of speech sound on visual character processing, but not the influence of visual
character on speech sound processing.

A fronto-temporal N300 congruency effect was found in a later stage, with 
congruent trials inducing smaller N300 amplitude in comparison to baseline. The 
latency and distribution of N300 resemble previously reported N300 in tasks requiring 
access to phonological representations of visual words (Bentin, Mouchetant-Rostaing, 
Giard, Echallier, & Pernier, 1999; Spironelli & Angrilli, 2007). In children, N300 was 
found to be associated with the integration of orthographic and phonological 
representations over the fronto-temporal lobe (Penolazzi, Spironelli, Vio, & Angrilli, 
2006). Lower N300 amplitude in the congruent condition is thought to reflect
successful perceptual integration or access to mental representations (Male &
Gouldthorp, 2020), which is consistent with the N300 congruency effect found
in this study. In the study by Hasko, Bruder, Bartling and Schulte-Körne (2012),
auditory stimuli were presented before visual words; left-lateralized word-speech
integration on N300 was found in TD children, and bilateral word-speech integration
on N300 was found in the DD group. Such a left-lateralized N300 congruency effect was
not found in the present study, probably because visual characters and speech sounds
were presented synchronously in the AV0 condition. The difference demonstrates the
important influence of temporal asynchrony on the neural mechanism of print-speech
sound integration.

4.2 Electrophysiological indices of congruency effects in the AV300 and AV600
conditions

In the AV300 and AV600 conditions, auditory speech sounds induced a fronto- 
temporal N200 component. In both conditions, congruent trials induced larger N200 
amplitudes in comparison to baseline. The N200 component has been related to 
several functional mechanisms, such as working memory processes (Sams, 1983), the 
acquisition of implicit knowledge (Tatiana et al., 2014), selective pre-attentive 
stimulus evaluation and stimulus discrimination (Howe, 2014; Polich, 2007). In 
children, fronto-central N200 responses induced by auditory stimulus are thought to 
represent encoding of acoustic properties in both primary and secondary auditory 
cortices (Ponton, Eggermont, Kwong, & Don, 2000). We tentatively speculate that the 
N200 component may reflect online encoding of acoustic information since it was 
induced by the onset of auditory speech sound. 

In the AV300 condition, dyslexic children showed a congruency effect over left
fronto-temporal electrode sites, whereas the TD group showed a bilaterally distributed
congruency effect. This difference may be caused by the well-documented visual
perceptual difficulties (Cornelissen, Hansen, Hutton, Evangelinou, & Stein, 1998;
Livingstone, Rosen, Drislane, & Galaburda, 1991) or auditory perceptual difficulties
(Helenius, Uutela, & Hari, 1999) in dyslexic readers. Numerous studies have
indicated that dyslexics exhibited a visual processing difficulty, particularly in rapid 
visual temporal processing (Lovegrove, Slaghuis, Bowling, Nelson, & Geeves, 1986). 
Specifically, dyslexic readers exhibited an asymmetric sluggish attention capture in 
the left visual field (Facoetti & Molteni, 2001; Jaskowski & Rusiak, 2008), which 
may result in insufficient visual processing of character on the right hemisphere. The 
insufficient character processing in turn contributed to the absence of N200 
congruency effect on the right hemisphere in DD group. Another possible reason for 
the absence of N200 congruency effect on the right hemisphere in dyslexic children 
may be their asymmetrical deficit in auditory processing. Previous data have shown
reduced phase entrainment in right-hemisphere auditory networks in dyslexic
participants (Cutini, Szücs, Mead, Huss, & Goswami, 2016), and impaired neuro-electric
oscillations in the theta and delta bands on the right hemisphere (Goswami, 2011). Such
deficits may explain the weak N200 response on the right hemisphere in dyslexic
children.

Multisensory integration starts from the cross-modal inputs to sensory cortices 
(Cappe, Thut, Romei, & Murray, 2010; Mercier & Cappe, 2020). As each sensory 
system has its own processing speed (Breznitz, 2006), the gap between visual 
processing speed and auditory processing speed may affect the quality of integration 
(Shaul, 2014). In the AV600 condition, the temporal disparity between visual
characters and speech sounds was relatively long, allowing the dyslexic children to
process visual characters more sufficiently, thus leading to equivalent N200
congruency effects on both hemispheres. In the AV300 and AV600 conditions, the
congruency effects on N200 reflect the influence of visual characters on the
processing of speech sounds. The laterality of the N200 congruency effect in dyslexic
children differed between the AV300 and AV600 conditions, demonstrating the
important influence of visual information on speech sound processing.

4.3 Relationships between the neural correlates of congruency effects and
reading skills

The relationships between neural correlates of congruency effects and behavioral
measures of reading-related skills differed between the two groups. In the AV0 and
AV300 conditions, reading-related skills correlated with the congruency effects in the
expected direction in TD children, with better readers showing greater congruency
effects. In the AV0 condition, RAN correlated with the N300 congruency effect in the
opposite direction in dyslexic children, suggesting the greatest congruency effect for
the poorest readers. The TD children showed a significant correlation between visual
spatial attention and the congruency effect in the AV0 condition, whereas dyslexic
children showed a significant correlation between visual spatial attention and the
congruency effect in the AV300 condition. The slow involvement of visual spatial attention may confirm
a deficit in attention capture in dyslexic children (Facoetti & Molteni, 2001; 
Jaskowski & Rusiak, 2008). No significant correlation between reading-related skills 
and congruency effect was found in the AV600 condition, suggesting that the 
influence of reading-related skills on neural congruency effects depends on the 
temporal relationship between cross-modal stimuli. In general, the neural correlates of
congruency effects can index reading-related skills in TD children, but the
relationships between reading-related skills and the neural correlates of congruency
effects in dyslexic children were more complicated.

4.4 Limitations of the present study

Lastly, it is important to point out possible limitations of our study. One
limitation is the small sample size of the subgroups. In the future, the study should be
repeated with a larger sample in order to gain statistical power and to assess whether
the results are replicable and robust. The second limitation is that only the influence of
speech sounds on the processing of visual characters was observed in the AV0 condition.
The ERP responses may be visual-dominant in the passive viewing and listening task;
future studies should employ an active task to direct participants' attention to the
speech sounds.

Conclusion

The present study found different neural correlates of character-speech sound
integration in the temporal synchrony and temporal asynchrony conditions.
Congruency effects on P1, N170 and N300 reflect the influence of speech sound on
visual character processing in the AV0 condition. Congruency effects on N200 in the
AV300 and AV600 conditions reflect the influence of visual character on speech
sound processing. Compared with TD children, dyslexic children showed insufficient
character-speech integration in the AV300 condition, suggesting that the
character-speech sound integration deficit in dyslexic readers is modulated by the
temporal relationship between visual and auditory stimuli.

Table 1. Mean (SD) age and scores on reading-related skills in developmental dyslexia (DD) and
typically developing (TD) groups

                                            TD group (n=14)    DD group (n=13)    T value
                                            M (SD)             M (SD)
Gender (male/female)                        10/4               10/3
Age (years)                                 10.19 (0.69)       10.19 (0.66)       0.01
CRT (combined Raven's test)                 114.85 (11.31)     106.92 (11.16)     1.8
CCRT (Chinese character recognition test)   2708.69 (380.69)   1917.34 (530.00)   4.378"
Reading fluency                             93.38 (16.19)      77.38 (16.59)      2.49
Reading accuracy                            98.69 (10.60)      87.31 (9.38)       2.90"
Morphological awareness                     16.08 (2.10)       14.00 (1.87)       2.66
Phonological awareness                      28.23 (1.74)       23.92 (3.55)       3.93
Rapid naming                                15.31 (2.09)       17.88 (2.62)       -1.76
Short memory                                6.64 (0.94)        6.15 (1.44)        -0.293
Visual spatial attention                    31.67 (5.52)       27.23 (6.43)       -1.844

*p<0.05, **p<0.01, ***p<0.001

Table 2. Mean number of epochs in the baseline and congruent conditions in each group and each
SOA

                          AV0             AV300           AV600
                          M (SD)          M (SD)          M (SD)
TD (N = 14)  Baseline     60.46 (12.39)   56.69 (14.42)   52 (17.68)
             Congruent    62.85 (11.10)   57.15 (13.52)   52 (16.41)
DD (N = 13)  Baseline     67.08 (10.56)   62.23 (11.77)   59.15 (13.56)
             Congruent    66.62 (11.86)   62.38 (11.94)   58.69 (14.51)

Figure 1. Schematic description of the experimental design. (A) An example of
cross-modal stimulus pairs in the congruent and baseline conditions. Visual stimuli were
either written Chinese characters or Korean symbols. Auditory stimuli were
pronunciations of a Chinese character. (B) The stimulus sequence within an
experimental trial. Each trial started with a central fixation presented for 500 ms,
followed by a visual stimulus with a duration of 1000 ms. The auditory stimulus was
presented synchronously or after a 300 ms or 600 ms time interval. The visual
stimulus was followed by a random blank interval of 500-1000 ms.

Figure 2. The layout of the configuration of electrodes and selection of electrodes for
data analysis. In the AV0 condition, the electrodes marked by blue dots are selected
for the analysis of P1 and N170. The electrodes marked by red dots are selected for
the analysis of N300. In the AV300 and AV600 conditions, the electrodes marked by
red dots are selected for the analysis of N200.

Figure 3. Grand average ERPs and topographic graphs in response to the auditory
onset under the AV0 condition. Multisensory integrations were involved in the time
windows of 110-180 ms, 180-250 ms, and 260-340 ms. The early P1 and N170 can be
observed on PO7/8 electrodes, and fronto-temporal N300 can be observed on the
AF3/4 electrodes.

Figure 4. Grand average ERPs and topographic graphs in response to the auditory
onset under the AV300 condition. Multisensory integrations were involved in the time
window of 170-230 ms.

Figure 5. Grand average ERPs and topographic graphs in response to the auditory
onset under the AV600 condition. Multisensory integrations were involved in the time
window of 150-250 ms.